Local Caching
Learn about local caching in Python.
Local caching has the advantage of being (very) fast, as it does not require access to a remote cache over the network. Usually, caching works by storing cached data under a key that identifies it. This technique makes the Python dictionary the most obvious data structure for implementing a caching mechanism.
A simple cache using Python dictionary#
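A minimal sketch of such a cache, using a plain dict and a hypothetical expensive `compute` function as a stand-in for real work:

```python
cache = {}

def compute(key):
    # Stand-in for an expensive operation (a network call, a big query, ...).
    return key * 2

def get(key):
    # Return the cached value if present; otherwise compute and store it.
    if key not in cache:
        cache[key] = compute(key)
    return cache[key]

assert get(21) == 42   # first call computes and stores the value
assert get(21) == 42   # second call is served straight from the dict
```

Every call after the first for a given key skips `compute` entirely, which is the whole point of the cache.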
Obviously, such a simple cache has a few drawbacks. First, its size is unbounded, which means it can grow large enough to fill up the entire system memory. In the worst case, that kills the process or even the whole operating system.
Therefore, a policy must be implemented to expire items from the cache, to make sure the data store does not grow out of control. A few eviction algorithms are simple to implement:
- Least recently used (LRU) removes the least recently used item first. This means the last access time for each item must be stored.
- Least frequently used (LFU) removes the least frequently used items first. This means the number of accesses for each item must be stored.
- Time-to-live (TTL) based removes any entry that is older than a certain period of time. This has the benefit of automatically invalidating the cache after a certain amount of time, whereas the LRU and LFU policies are only access-based.
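As an illustration of the LRU policy (not something to deploy; ready-made implementations exist, as shown later), the bookkeeping can be sketched with `collections.OrderedDict`, whose first key is always the least recently used:

```python
from collections import OrderedDict

class LRU:
    """A minimal LRU cache sketch for illustration only."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key):
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # evict the least recently used item
```

Moving a key to the end on every access keeps the dict ordered from least to most recently used, so eviction is just popping the front.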
Methods like LRU and LFU make more sense for memoization (see Memoization). Other methods, such as TTL, are more commonly used for a locally stored copy of remote data.
cachetools#
The cachetools Python package provides implementations of all those algorithms, and it is pretty easy to use as shown in the following example. There should be no need to implement this kind of cache yourself.
The cachetools.LRUCache class provides an implementation of an LRU cache
mechanism. The maximum size is set to three in this example, so as soon as a fourth item is added to the cache, the least recently used one is discarded.
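A short sketch of that behavior, assuming the `cachetools` package is installed:

```python
from cachetools import LRUCache

cache = LRUCache(maxsize=3)   # holds at most three items
cache["a"] = 1
cache["b"] = 2
cache["c"] = 3

cache["a"]        # touch "a" so it becomes the most recently used
cache["d"] = 4    # cache is full: the least recently used item ("b") is evicted

assert "b" not in cache
assert set(cache) == {"a", "c", "d"}
```

Because `LRUCache` behaves like a dictionary, it can replace a plain dict in existing code with no other changes.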
As a demonstration, consider a program that uses cachetools to cache a Web page for five seconds, keeping up to five pages at once. Run in a loop, the program prints the page every second but only refreshes it over the network every five seconds.
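A minimal sketch of such a demo, assuming `cachetools` is installed and using `http://example.com` as a placeholder URL:

```python
import time
import urllib.request

from cachetools import TTLCache, cached

# Up to five pages, each kept for five seconds.
page_cache = TTLCache(maxsize=5, ttl=5)

@cached(page_cache)
def get_page(url):
    # Runs only on a cache miss; within the TTL the cached body is returned.
    with urllib.request.urlopen(url) as response:
        return response.read()

# Printing the page once per second only refetches it every five seconds:
# while True:
#     print(get_page("http://example.com"))  # placeholder URL
#     time.sleep(1)
```

The `cachetools.cached` decorator memoizes the function through whatever cache object it is given, so swapping the eviction policy is a one-line change.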
The TTLCache class also accepts a custom definition of time: by passing your own timer function instead of the default (which counts seconds), you can make it count time in any other unit you would like (iterations, pings, requests, etc.).
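For instance, a hand-advanced counter can stand in for the clock, so the TTL counts "ticks" rather than seconds (the tick counter here is purely illustrative):

```python
from cachetools import TTLCache

ticks = 0  # a hand-advanced "clock" counting arbitrary events

cache = TTLCache(maxsize=10, ttl=2, timer=lambda: ticks)  # expire after 2 ticks
cache["key"] = "value"

ticks += 1
assert "key" in cache       # only one tick has elapsed, still valid

ticks += 2
assert "key" not in cache   # three ticks have elapsed, the entry expired
```

Any monotonically increasing function works as the timer, which is what makes request- or iteration-based expiry possible.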